# WACVW-2024-Papers

<table>
    <tr>
        <td><strong>Application</strong></td>
        <td>
            <a href="https://huggingface.co/spaces/DmitryRyumin/NewEraAI-Papers" style="float:left;">
                <img src="https://img.shields.io/badge/🤗-NewEraAI--Papers-FFD21F.svg" alt="App" />
            </a>
        </td>
    </tr>
</table>

<div align="center">
    <a href="https://github.com/DmitryRyumin/WACV-2024-Papers/blob/main/sections/2024/workshops/w_iva_q_cv_gai.md">
        <img src="https://cdn.jsdelivr.net/gh/DmitryRyumin/NewEraAI-Papers@main/images/left.svg" width="40" alt="Previous section" />
    </a>
    <a href="https://github.com/DmitryRyumin/WACV-2024-Papers/">
        <img src="https://cdn.jsdelivr.net/gh/DmitryRyumin/NewEraAI-Papers@main/images/home.svg" width="40" alt="Home" />
    </a>
    <a href="https://github.com/DmitryRyumin/WACV-2024-Papers/blob/main/sections/2024/workshops/w_smart_computing_and_internet_of_things_design.md">
        <img src="https://cdn.jsdelivr.net/gh/DmitryRyumin/NewEraAI-Papers@main/images/right.svg" width="40" alt="Next section" />
    </a>
</div>

## Pretraining

![Section Papers](https://img.shields.io/badge/Section%20Papers-15-42BA16) ![Preprint Papers](https://img.shields.io/badge/Preprint%20Papers-9-b31b1b) ![Papers with Open Code](https://img.shields.io/badge/Papers%20with%20Open%20Code-5-1D7FBF) ![Papers with Video](https://img.shields.io/badge/Papers%20with%20Video-0-FF0000)

| **Title** | **Repo** | **Paper** | **Video** |
|-----------|:--------:|:---------:|:---------:|
| [COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Denize_COMEDIAN_Self-Supervised_Learning_and_Knowledge_Distillation_for_Action_Spotting_Using_WACVW_2024_paper.html) | [![GitHub Page](https://img.shields.io/badge/GitHub-Page-159957.svg)](https://juliendenize.github.io/eztorch/contributions/comedian.html) <br /> [![GitHub](https://img.shields.io/github/stars/juliendenize/eztorch?style=flat)](https://github.com/juliendenize/eztorch) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Denize_COMEDIAN_Self-Supervised_Learning_and_Knowledge_Distillation_for_Action_Spotting_Using_WACVW_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2309.01270-b31b1b.svg)](http://arxiv.org/abs/2309.01270) | :heavy_minus_sign: |
| [Self-Supervised Pre-Training for Semantic Segmentation in an Indoor Scene](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Shrestha_Self-Supervised_Pre-Training_for_Semantic_Segmentation_in_an_Indoor_Scene_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Shrestha_Self-Supervised_Pre-Training_for_Semantic_Segmentation_in_an_Indoor_Scene_WACVW_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2210.01884-b31b1b.svg)](http://arxiv.org/abs/2210.01884) | :heavy_minus_sign: |
| [E-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Fang_E-ViLM_Efficient_Video-Language_Model_via_Masked_Video_Modeling_With_Semantic_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Fang_E-ViLM_Efficient_Video-Language_Model_via_Masked_Video_Modeling_With_Semantic_WACVW_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.17267-b31b1b.svg)](http://arxiv.org/abs/2311.17267) | :heavy_minus_sign: |
| [Metric Learning for 3D Point Clouds using Optimal Transport](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Katageri_Metric_Learning_for_3D_Point_Clouds_Using_Optimal_Transport_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Katageri_Metric_Learning_for_3D_Point_Clouds_Using_Optimal_Transport_WACVW_2024_paper.pdf) | :heavy_minus_sign: |
| [Does the Fairness of Your Pre-Training Hold Up? Examining the Influence of Pre-Training Techniques on Skin Tone Bias in Skin Lesion Classification](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Seth_Does_the_Fairness_of_Your_Pre-Training_Hold_Up_Examining_the_WACVW_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/ptnv-s/PretrainingImpactOnSkinBias?style=flat)](https://github.com/ptnv-s/PretrainingImpactOnSkinBias) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Seth_Does_the_Fairness_of_Your_Pre-Training_Hold_Up_Examining_the_WACVW_2024_paper.pdf) | :heavy_minus_sign: |
| [Semi-Supervised Cross-Spectral Face Recognition with Small Datasets](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Nanduri_Semi-Supervised_Cross-Spectral_Face_Recognition_With_Small_Datasets_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Nanduri_Semi-Supervised_Cross-Spectral_Face_Recognition_With_Small_Datasets_WACVW_2024_paper.pdf) | :heavy_minus_sign: |
| [Labeling Indoor Scenes with Fusion of Out-of-the-Box Perception Models](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Li_Labeling_Indoor_Scenes_With_Fusion_of_Out-of-the-Box_Perception_Models_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Li_Labeling_Indoor_Scenes_With_Fusion_of_Out-of-the-Box_Perception_Models_WACVW_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.10883-b31b1b.svg)](http://arxiv.org/abs/2311.10883) | :heavy_minus_sign: |
| [RDIR: Capturing Temporally-Invariant Representations of Multiple Objects in Videos](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Zielinski_RDIR_Capturing_Temporally-Invariant_Representations_of_Multiple_Objects_in_Videos_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Zielinski_RDIR_Capturing_Temporally-Invariant_Representations_of_Multiple_Objects_in_Videos_WACVW_2024_paper.pdf) | :heavy_minus_sign: |
| [SLVP: Self-Supervised Language-Video Pre-Training for Referring Video Object Segmentation](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Mei_SLVP_Self-Supervised_Language-Video_Pre-Training_for_Referring_Video_Object_Segmentation_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Mei_SLVP_Self-Supervised_Language-Video_Pre-Training_for_Referring_Video_Object_Segmentation_WACVW_2024_paper.pdf) | :heavy_minus_sign: |
| [How Does Contrastive Learning Organize Images?](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Zhang_How_Does_Contrastive_Learning_Organize_Images_WACVW_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/xsgxlz/How-does-Contrastive-Learning-Organize-Images?style=flat)](https://github.com/xsgxlz/How-does-Contrastive-Learning-Organize-Images) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Zhang_How_Does_Contrastive_Learning_Organize_Images_WACVW_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2305.10229-b31b1b.svg)](http://arxiv.org/abs/2305.10229) | :heavy_minus_sign: |
| [Zero-Shot Edge Detection with SCESAME: Spectral Clustering-based Ensemble for Segment Anything Model Estimation](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Yamagiwa_Zero-Shot_Edge_Detection_With_SCESAME_Spectral_Clustering-Based_Ensemble_for_Segment_WACVW_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/ymgw55/SCESAME?style=flat)](https://github.com/ymgw55/SCESAME) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Yamagiwa_Zero-Shot_Edge_Detection_With_SCESAME_Spectral_Clustering-Based_Ensemble_for_Segment_WACVW_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2308.13779-b31b1b.svg)](http://arxiv.org/abs/2308.13779) | :heavy_minus_sign: |
| [Source-Free Domain Adaptation for RGB-D Semantic Segmentation with Vision Transformers](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Rizzoli_Source-Free_Domain_Adaptation_for_RGB-D_Semantic_Segmentation_With_Vision_Transformers_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Rizzoli_Source-Free_Domain_Adaptation_for_RGB-D_Semantic_Segmentation_With_Vision_Transformers_WACVW_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2305.14269-b31b1b.svg)](http://arxiv.org/abs/2305.14269) | :heavy_minus_sign: |
| [Cross-Modal Contrastive Learning with Asymmetric Co-Attention Network for Video Moment Retrieval](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Panta_Cross-Modal_Contrastive_Learning_With_Asymmetric_Co-Attention_Network_for_Video_Moment_WACVW_2024_paper.html) | [![GitHub](https://img.shields.io/github/stars/love481/Cross-modal-Contrastive-Learning-with-Asymmetric-Co-attention-Network-for-Video-Moment-Retrieval?style=flat)](https://github.com/love481/Cross-modal-Contrastive-Learning-with-Asymmetric-Co-attention-Network-for-Video-Moment-Retrieval) | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Panta_Cross-Modal_Contrastive_Learning_With_Asymmetric_Co-Attention_Network_for_Video_Moment_WACVW_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2312.07435-b31b1b.svg)](http://arxiv.org/abs/2312.07435) | :heavy_minus_sign: |
| [Evaluating Pretrained Models for Deployable Lifelong Learning](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Lekkala_Evaluating_Pretrained_Models_for_Deployable_Lifelong_Learning_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Lekkala_Evaluating_Pretrained_Models_for_Deployable_Lifelong_Learning_WACVW_2024_paper.pdf) <br /> [![arXiv](https://img.shields.io/badge/arXiv-2311.13648-b31b1b.svg)](http://arxiv.org/abs/2311.13648) | :heavy_minus_sign: |
| [A Unified Framework for Cropland Field Boundary Detection and Segmentation](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/html/Rangel_A_Unified_Framework_for_Cropland_Field_Boundary_Detection_and_Segmentation_WACVW_2024_paper.html) | :heavy_minus_sign: | [![thecvf](https://img.shields.io/badge/pdf-thecvf-7395C5.svg)](https://openaccess.thecvf.com/content/WACV2024W/Pretrain/papers/Rangel_A_Unified_Framework_for_Cropland_Field_Boundary_Detection_and_Segmentation_WACVW_2024_paper.pdf) | :heavy_minus_sign: |
